Convergence of Reinforcement Learning with General Function Approximators
Abstract
A key open problem in reinforcement learning is to assure convergence when using a compact hypothesis class to approximate the value function. Although the standard temporal-difference learning algorithm has been shown to converge when the hypothesis class is a linear combination of fixed basis functions, it may diverge with a general (nonlinear) hypothesis class. This paper describes the Bridge algorithm, a new method for reinforcement learning, and shows that it converges to an approximate global optimum for any agnostically learnable hypothesis class. Convergence is demonstrated on a simple example for which temporal-difference learning fails. Weak conditions are identified under which the Bridge algorithm converges for any hypothesis class. Finally, connections are made between the complexity of reinforcement learning and the PAC-learnability of the hypothesis class.
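As a point of reference for the linear case mentioned above, the following is a minimal sketch of semi-gradient TD(0) with a fixed linear basis, the setting in which temporal-difference learning is known to converge under on-policy sampling. The toy MDP, feature matrix, discount factor, and step size are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_features = 5, 3
phi = rng.standard_normal((n_states, n_features))  # fixed basis functions phi(s)
w = np.zeros(n_features)                           # V(s) approximated as phi(s) . w
gamma, alpha = 0.9, 0.05                           # discount factor, step size

def step(s):
    """Toy MDP for illustration only: uniform transitions, reward 1 in the last state."""
    s_next = int(rng.integers(n_states))
    r = 1.0 if s_next == n_states - 1 else 0.0
    return r, s_next

s = 0
for _ in range(10_000):
    r, s_next = step(s)
    # TD(0) update: w <- w + alpha * delta * phi(s), where delta is the TD error
    delta = r + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * delta * phi[s]
    s = s_next

print("learned weights:", w)
```

Because the states are sampled from the on-policy distribution and the approximator is linear in w, the iterates settle near the TD fixed point; replacing phi(s) . w with a nonlinear network voids this guarantee, which is the failure mode the paper addresses.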
Similar Resources
Reinforcement Learning with Echo State Networks
Function approximators are often used in reinforcement learning tasks with large or continuous state spaces. Artificial neural networks, among them recurrent neural networks, are popular function approximators, especially in tasks where some kind of memory is needed, as in real-world partially observable scenarios. However, convergence guarantees for such methods are rarely available. Here,...
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
On large problems, reinforcement learning systems must use parameterized function approximators such as neural networks in order to generalize between similar situations and actions. In these cases there are no strong theoretical results on the accuracy of convergence, and computational results have been mixed. In particular, Boyan and Moore reported at last year's meeting a series of negative...
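To make the representation concrete, below is a minimal sketch of sparse coarse coding (tile coding) over a one-dimensional state. The number of tilings, tiles per tiling, and offsets are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def tile_features(x, n_tilings=4, tiles_per_tiling=8, lo=0.0, hi=1.0):
    """Return a sparse binary feature vector with one active tile per tiling."""
    width = (hi - lo) / tiles_per_tiling
    feats = np.zeros(n_tilings * tiles_per_tiling)
    for t in range(n_tilings):
        offset = t * width / n_tilings            # each tiling is shifted slightly
        idx = int((x - lo + offset) / width)
        idx = min(max(idx, 0), tiles_per_tiling - 1)  # clamp at the boundaries
        feats[t * tiles_per_tiling + idx] = 1.0
    return feats

print(tile_features(0.37))  # exactly n_tilings entries are 1
```

Each tiling contributes exactly one active feature, so nearby states share most of their active tiles; that overlap is what produces generalization while keeping the overall approximator linear in the weights.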
Sparse Distributed Memories in a Bounded Metric State Space: Some Theoretical and Empirical Results
Sparse Distributed Memories (SDM) [7] is a linear, local function approximation architecture that can be used to represent cost-to-go or state-action value functions of reinforcement learning (RL) problems. It offers the possibility of reconciling the convergence guarantees of linear approximators with the ability to scale to higher dimensionality that is typically exclusive to nonlinear architectures. We ...
Convergence and Divergence in Standard and Averaging Reinforcement Learning
Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, combining certain conventional RL techniques with function approximators can lead to divergence. In this paper we show why off-policy RL methods combined with linear function approximators can lead to divergence. Furthermore, we analyze two different types of u...
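The divergence phenomenon described above can be seen in the classic two-state example for off-policy TD(0) with linear function approximation. The sketch below is a standard construction from the literature, not code from this paper, and the constants are arbitrary illustrative choices.

```python
# One weight w is shared by two states with features phi(s1) = 1 and
# phi(s2) = 2, so V(s1) = w and V(s2) = 2w. Updating only the s1 -> s2
# transition (an off-policy state distribution) with reward 0 gives
# delta = (2*gamma - 1) * w, so for gamma > 0.5 the weight grows without bound.

gamma, alpha = 0.99, 0.1
w = 1.0
for _ in range(50):
    delta = gamma * 2.0 * w - 1.0 * w   # TD error on the s1 -> s2 transition
    w += alpha * delta * 1.0            # gradient of V(s1) w.r.t. w is phi(s1) = 1
print("w after 50 updates:", w)         # grows geometrically: divergence
```

Under on-policy sampling the s2 transitions would pull w back down; it is the mismatch between the update distribution and the policy's state distribution that drives the instability.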
High-accuracy value-function approximation with neural networks applied to the Acrobot
Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy ...